Abstract. We present a simple model reproducing the long-range autocorrelations and the power spectrum of the web traffic. The model treats the traffic as a Poisson flow of files with sizes distributed according to a power law. In this model the long-range autocorrelations are independent of the network properties as well as of the inter-packet time distribution.

INTRODUCTION 

The power spectra of a large variety of complex systems exhibit 1/f behavior at low frequencies. It is widely accepted that 1/f noise and self-similarity are characteristic signatures of complexity [1, 2]. Studies of network traffic, and especially of Internet traffic, demonstrate the close relation between self-similarity and complexity. Nevertheless, there is no evidence whether this complexity arises from the computer network or from the computer file statistics. We have already proposed several stochastic point process models exhibiting self-similarity and 1/f noise [3, 4, 5, 6]. The signal in these models is a sequence of pulses or events. In the case of $\delta$-type pulses (point process) the signal is defined by the stochastic process of the interevent time [3]. We have shown that Brownian motion of the interevent time of the signal pulses [3], or more general stochastic fluctuations described by a multiplicative Langevin equation, are responsible for the 1/f noise of the model signal [5]. It is natural to model self-similar computer network traffic by such a stochastic point process signal. Success would mean that the self-similar behavior is induced by the stochastic arrival of requests from the network. Another possibility is that the self-similar behavior is induced by the server statistics rather than by the arrival process. Empirical analysis of computer network traffic provides evidence that the second possibility is more realistic [7]. This led us to model the computer network traffic by a Poisson sequence of pulses with stochastic duration. We recently showed that under a suitable choice of the pulse duration statistics such a signal exhibits 1/f noise [6].

In this contribution we provide analytical and numerical results that are consistent with the empirical data and confirm that the self-similar behavior of computer network traffic is related to the power-law distribution of the files transferred in the network.



SIGNAL AS A SEQUENCE OF PULSES 



We will investigate a signal consisting of a sequence of pulses. We assume that: 

1. the pulse sequences are stationary and ergodic; 

2. interevent times and the shapes of different pulses are independent. 

The general form of such a signal can be written as

$$I(t) = \sum_k A_k(t - t_k), \tag{1}$$

where the functions $A_k(t)$ determine the shape of the individual pulses and the time moments $t_k$ determine when the pulses occur. The time moments $t_k$ are not correlated with the shape of the pulse $A_k$, and the interevent times $\tau_k = t_k - t_{k-1}$ are random and uncorrelated. The occurrence times of the pulses $t_k$ are distributed according to a Poisson process.
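
As an illustration of Eq. (1), the following minimal sketch samples such a signal on a regular time grid for the simplest case of identical rectangular pulses. The function name, the grid step `dt` and the parameter values are assumptions made for this example only; they are not taken from the paper.

```python
import numpy as np

def rectangular_pulse_train(rate, duration, height, total_time, dt, seed=0):
    """Sample I(t) = sum_k A_k(t - t_k) on a grid for identical rectangular pulses.

    The pulse start times t_k form a homogeneous Poisson process with the
    given rate (illustrative parameters, not the paper's values).
    """
    rng = np.random.default_rng(seed)
    # Poisson process on [0, total_time): Poisson count, then uniform times.
    n_pulses = rng.poisson(rate * total_time)
    starts = np.sort(rng.uniform(0.0, total_time, n_pulses))
    n_grid = int(round(total_time / dt))
    # Difference array: +height where a pulse begins, -height where it ends.
    steps = np.zeros(n_grid + 1)
    i_start = np.minimum((starts / dt).astype(np.int64), n_grid)
    i_stop = np.minimum(((starts + duration) / dt).astype(np.int64), n_grid)
    np.add.at(steps, i_start, height)
    np.add.at(steps, i_stop, -height)
    signal = np.cumsum(steps)[:n_grid]
    return starts, signal

starts, signal = rectangular_pulse_train(
    rate=5.0, duration=0.1, height=1.0, total_time=2000.0, dt=0.01
)
# The time average of I(t) should be close to rate * height * duration.
print(signal.mean())
```

The difference-array construction keeps the cost linear in the number of pulses and grid points, which matters once the pulse durations become broadly distributed.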

The power spectrum is given by the equation

$$S(f) = \lim_{T\to\infty} \left\langle \frac{2}{T} \left| \int_{t_i}^{t_f} I(t)\, e^{-i 2\pi f t}\, dt \right|^2 \right\rangle, \tag{2}$$

where $T = t_f - t_i$ and the brackets $\langle \ldots \rangle$ denote the averaging over realizations of the process. The power spectral density of a random pulse train is given by Carson's theorem

$$S(f) = 2\bar{\nu} \left\langle |F_k(\omega)|^2 \right\rangle, \qquad \omega = 2\pi f, \tag{3}$$

where

$$F_k(\omega) = \int_{-\infty}^{+\infty} A_k(u)\, e^{i\omega u}\, du \tag{4}$$

is the Fourier transform of the pulse $A_k$ and

$$\bar{\nu} = \lim_{T\to\infty} \left\langle \frac{N+1}{T} \right\rangle \tag{5}$$

is the mean number of pulses per unit time. Here $N = k_{\max} - k_{\min}$ is the number of pulses.
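
For a sampled signal, the definition in Eq. (2) can be approximated by a periodogram. The sketch below is one possible discretization, assuming a regular sampling step `dt` and a single realization (the averaging over realizations in Eq. (2) would be done by averaging the output over many runs). The function name is hypothetical.

```python
import numpy as np

def power_spectrum(signal, dt):
    """One-sided periodogram S(f) = (2/T) |integral I(t) exp(-i 2 pi f t) dt|^2.

    The Fourier integral is approximated by a Riemann sum on the sampling grid,
    evaluated at the frequencies f_k = k / T.
    """
    n = len(signal)
    total_time = n * dt
    fourier = dt * np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(n, d=dt)
    return freqs, 2.0 / total_time * np.abs(fourier) ** 2

# Sanity check on a unit-amplitude cosine at 5 Hz observed for T = 10 s:
# the integral equals T/2 at f = 5 Hz, so the peak value is T/2.
dt = 1e-3
t = np.arange(10_000) * dt
freqs, spec = power_spectrum(np.cos(2 * np.pi * 5.0 * t), dt)
print(freqs[np.argmax(spec)], spec.max())
```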



Pulses of variable duration 

Let the duration be the only random parameter of the pulse. We take the form of the pulse as

$$A_k(t) = T_k^{\beta} A\!\left(\frac{t}{T_k}\right), \tag{6}$$

where $T_k$ is the characteristic duration of the pulse. The value $\beta = 0$ corresponds to fixed height pulses; $\beta = -1$ corresponds to constant area pulses. Differentiating the fixed area pulses we obtain $\beta = -2$. The Fourier transform of the pulse (6) is

$$F_k(\omega) = \int_{-\infty}^{+\infty} T_k^{\beta} A\!\left(\frac{t}{T_k}\right) e^{i\omega t}\, dt = T_k^{\beta+1} \int_{-\infty}^{+\infty} A(u)\, e^{i\omega T_k u}\, du \equiv T_k^{\beta+1} F(\omega T_k).$$

From Eq. (3) the power spectrum is

$$S(f) = 2\bar{\nu} \left\langle T_k^{2\beta+2} |F(\omega T_k)|^2 \right\rangle. \tag{7}$$

Introducing the probability density $P(T_k)$ of the pulse durations $T_k$ we can write

$$S(f) = 2\bar{\nu} \int_0^{\infty} T_k^{2\beta+2} |F(\omega T_k)|^2 P(T_k)\, dT_k. \tag{8}$$

If $P(T_k)$ is a power-law distribution, then the expressions for the spectrum are similar for all $\beta$.

Power-law distribution 

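We take the power-law distribution of the pulse durations

$$P(T_k) = \begin{cases} \dfrac{\alpha+1}{T_{\max}^{\alpha+1} - T_{\min}^{\alpha+1}}\, T_k^{\alpha}, & T_{\min} \le T_k \le T_{\max}, \\ 0, & \text{otherwise}. \end{cases} \tag{9}$$

From Eq. (8) we have the spectrum

$$S(f) = \frac{2\bar{\nu}(\alpha+1)}{\left(T_{\max}^{\alpha+1} - T_{\min}^{\alpha+1}\right) \omega^{\alpha+2\beta+3}} \int_{\omega T_{\min}}^{\omega T_{\max}} u^{\alpha+2\beta+2} |F(u)|^2\, du. \tag{10}$$

When $\alpha > -1$ and $1/T_{\max} \ll \omega \ll 1/T_{\min}$ then the expression for the spectrum can be approximated as

$$S(f) \approx \frac{2\bar{\nu}(\alpha+1)}{T_{\max}^{\alpha+1}\, \omega^{\alpha+2\beta+3}} \int_0^{\infty} u^{\alpha+2\beta+2} |F(u)|^2\, du.$$

If $\alpha + 2\beta + 2 = 0$ then in the frequency domain $1/T_{\max} \ll \omega \ll 1/T_{\min}$ the spectrum is

$$S(f) \approx \frac{(\alpha+1)\,\bar{\nu}}{\pi f \left(T_{\max}^{\alpha+1} - T_{\min}^{\alpha+1}\right)} \int_0^{\infty} |F(u)|^2\, du. \tag{11}$$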



Therefore, we obtain a 1/f spectrum. The condition $\alpha + 2\beta + 2 = 0$ is satisfied, e.g., for fixed area pulses ($\beta = -1$) and a uniform distribution of pulse durations ($\alpha = 0$), or for fixed height pulses ($\beta = 0$) and a uniform distribution of inverse durations $\gamma = T_k^{-1}$, i.e. for $P(T_k) \propto T_k^{-2}$.



Rectangular pulses 



We will obtain the spectrum of rectangular fixed height pulses ($\beta = 0$). The height of the pulse is $a$ and the duration is $T_k$. The Fourier transform of the pulse is

$$F(\omega T_k) = a \int_0^1 e^{i\omega T_k u}\, du = a\, \frac{e^{i\omega T_k} - 1}{i\omega T_k} = a\, e^{i\omega T_k/2}\, \frac{2\sin(\omega T_k/2)}{\omega T_k}. \tag{12}$$
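Then the spectrum according to Eqs. (8), (9) and (12) is

$$S(f) = \frac{4 a^2 \bar{\nu}}{\omega^2} \left\{ 1 - \frac{\alpha+1}{T_{\max}^{\alpha+1} - T_{\min}^{\alpha+1}}\, \mathrm{Re}\!\left[ \frac{\Gamma(\alpha+1, i\omega T_{\min}) - \Gamma(\alpha+1, i\omega T_{\max})}{(i\omega)^{\alpha+1}} \right] \right\}, \tag{13}$$

where $\Gamma(a,z)$ is the incomplete gamma function, $\Gamma(a,z) = \int_z^{\infty} u^{a-1} e^{-u}\, du$.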

For $\alpha = -2$ we have the uniform distribution of inverse durations. The term with $\Gamma(\alpha+1, i\omega T_{\max})$ is small and can be neglected. We also assume that $T_{\min} \ll T_{\max}$ and neglect the term $T_{\min}/T_{\max}$. Then we obtain the 1/f spectrum

$$S(f) \approx \frac{a^2 \bar{\nu}}{f}\, T_{\min}. \tag{14}$$

Next we investigate how the variable duration of the pulses is related to the variable size of the files transferred in computer networks.
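
The 1/f result of Eq. (14) can be checked numerically by integrating Eq. (8) directly for rectangular fixed height pulses with a truncated power-law density of durations. The sketch below is such a check under assumed parameter values (`rate`, `t_min`, `t_max` are illustrative and not from the paper); the integral is evaluated with a simple trapezoidal rule on a logarithmic grid.

```python
import numpy as np

def rectangular_pulse_spectrum(f, rate, height, t_min, t_max,
                               alpha=-2.0, n_grid=400_000):
    """Numerically integrate Eq. (8) for rectangular pulses (beta = 0) with
    durations distributed as P(T) ~ T**alpha on [t_min, t_max]."""
    omega = 2.0 * np.pi * f
    durations = np.geomspace(t_min, t_max, n_grid)  # log-spaced grid in T
    norm = (alpha + 1.0) / (t_max ** (alpha + 1.0) - t_min ** (alpha + 1.0))
    density = norm * durations ** alpha
    # For a rectangular pulse: T^2 |F(omega T)|^2 = 4 a^2 sin^2(omega T / 2) / omega^2
    pulse_term = 4.0 * height**2 * np.sin(0.5 * omega * durations) ** 2 / omega**2
    integrand = pulse_term * density
    # Trapezoidal rule written out explicitly.
    integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(durations))
    return 2.0 * rate * integral

rate, height, t_min, t_max = 10.0, 1.0, 1e-4, 1e2
for f in (0.3, 1.0, 3.0):
    s_num = rectangular_pulse_spectrum(f, rate, height, t_min, t_max)
    s_14 = height**2 * rate * t_min / f  # right-hand side of Eq. (14)
    print(f, s_num / s_14)               # ratio should stay close to 1
```

The frequencies are chosen well inside the window $1/T_{\max} \ll \omega \ll 1/T_{\min}$, where the approximation is expected to hold.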



MODELING COMPUTER NETWORK TRAFFIC BY SEQUENCE OF PULSES

In this section we provide numerical simulation results of computer network traffic based on the description of the signal as an uncorrelated sequence of web requests of variable size. We model empirical data of incoming web traffic publicly available on the Internet [8]. Our assumptions are closely related to the model description in the previous section and to the empirical data and analysis provided in Ref. [7]. First of all, from Eq. (14) it is clear that a sequence of requests distributed according to the power law (9) with $\alpha = -2$ yields a 1/f spectrum, as observed in the empirical data [8]. For the numerical calculations we use instead the positive Cauchy distribution

$$P(x) = \frac{2}{\pi}\, \frac{s}{s^2 + x^2}, \tag{15}$$

which better approximates the empirical request size distribution [8]. Here $s = 4100$ bytes is the empirical parameter of the distribution and $x$ is the stochastic size of the requested file in bytes. Requested files arrive as a Poisson sequence with mean inter-arrival time $\tau_f = 0.101$ seconds. The files arrive divided by the network protocol into $n_p = x/1500$ packets.
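
The request model described above can be sketched as follows. The parameter values are those quoted in the text; the function name, the rounding of $n_p$ up to an integer, and the minimum of one packet per file are assumptions made for this illustration.

```python
import numpy as np

S_BYTES = 4100.0       # scale s of the positive Cauchy distribution, Eq. (15)
TAU_F = 0.101          # mean inter-arrival time of file requests, seconds
PACKET_BYTES = 1500.0  # packet size used to split files

def sample_requests(n_files, seed=0):
    """Draw Poisson arrival times, positive Cauchy file sizes and packet counts."""
    rng = np.random.default_rng(seed)
    # Poisson sequence: independent exponential inter-arrival times.
    arrivals = np.cumsum(rng.exponential(TAU_F, n_files))
    # Positive Cauchy: absolute value of a standard Cauchy variate, scaled by s.
    sizes = S_BYTES * np.abs(rng.standard_cauchy(n_files))
    # Assumption: round x / 1500 up, with at least one packet per file.
    n_packets = np.maximum(1, np.ceil(sizes / PACKET_BYTES)).astype(np.int64)
    return arrivals, sizes, n_packets

arrivals, sizes, n_packets = sample_requests(200_000)
# The median of the positive Cauchy distribution equals its scale s.
print(np.median(sizes), np.diff(arrivals).mean())
```

The median is used for the check because the positive Cauchy distribution has no finite mean, which is exactly the heavy-tail property the model relies on.
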
FIGURE 1. Power spectral density S(f) versus frequency f calculated numerically according to Eq. (2): a) from empirical incoming web traffic presented in [8]; b) from numerically simulated traffic dividing files into a Poisson sequence of packets with $s = 4100$, $\tau_f = 0.101$, $\tau_p = 11.6 \times 10^{-6} \times 10^{\varepsilon}$, where $\varepsilon$ is a random variable uniformly distributed in the interval [0, 3]. Straight lines represent the theoretical prediction with empirical parameters according to Eq. (16).



According to our assumption these packets spread into another Poisson sequence with mean inter-packet time $\tau_p$. The total incoming web traffic is the sequence of packets resulting from all requests. This procedure converts the described Poisson sequence of variable duration pulses into a self-similar point process modeling the traffic of packets. Our numerical results confirm that the spectral properties of the packet traffic are defined by the Poisson sequence of variable duration pulses and are independent of the division of files into packets. It is natural to expect that the mean inter-packet time $\tau_p$ depends on the position in the network of the computer from which the file is requested. Consequently, the inter-packet time distribution measured from the empirical histogram or calculated in this model must depend on the computer network structure, whereas the spectral properties and autocorrelation of the signal are defined by the file size statistics, independent of network properties. Our numerical simulation of the incoming web traffic and its power spectrum, presented in Fig. 1, confirms that the flow of packets exhibits 1/f noise and long-range autocorrelation induced by the power-law (positive Cauchy) distribution of transferred files. Both the empirical and the simulated spectra are in good agreement with the theoretical prediction (14), which we rewrite with the empirical parameters of the model as Eq. (16). Here $p = 1500$ is the standard packet size in bytes and $\tau_{p,\max} = 11.6 \times 10^{-3}$ is the maximum inter-packet time.

In Fig. 2 we present the empirical and numerically simulated histograms of the inter-packet time $\tau_p$. We assume a very simple model to reproduce the empirical distribution of the packet arrivals. Files arrive divided into packets with inter-packet time $\tau_p = 11.6 \times 10^{-6} \times 10^{\varepsilon}$, where $\varepsilon$ is a random variable uniformly distributed in the interval [0, 3]. This assumption reproduces the empirical distribution of the inter-packet time quite well, as seen in Fig. 2.
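
A minimal sketch of this inter-packet model is given below. It assumes that one value of $\varepsilon$ is drawn per file and that the packets of a file follow a Poisson sequence with that mean spacing; the function names are hypothetical.

```python
import numpy as np

TAU_P_MIN = 11.6e-6  # seconds, shortest mean inter-packet time (eps = 0)

def sample_mean_inter_packet_time(n_files, rng):
    """tau_p = 11.6e-6 * 10**eps with eps uniform on [0, 3] (log-uniform tau_p)."""
    eps = rng.uniform(0.0, 3.0, n_files)
    return TAU_P_MIN * 10.0 ** eps

def packet_times(file_arrival, n_packets, tau_p, rng):
    """Arrival times of the packets of one file: a Poisson sequence that starts
    at the file arrival time and has mean inter-packet time tau_p."""
    gaps = rng.exponential(tau_p, n_packets)
    return file_arrival + np.cumsum(gaps)

rng = np.random.default_rng(1)
tau_p = sample_mean_inter_packet_time(100_000, rng)
print(tau_p.min(), tau_p.max())           # stays within [11.6e-6, 11.6e-3]
print(packet_times(2.0, 5, tau_p[0], rng))
```

Concatenating and sorting the packet times of all files gives the simulated packet flow whose inter-packet histogram and spectrum can be compared with the empirical ones.
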
FIGURE 2. Inter-packet time histograms: a) empirical data of incoming web traffic [8]; b) numerical simulation with the same parameters as in Fig. 1.

CONCLUSIONS 

In this contribution we present a very simple model reproducing the long-range autocorrelations and the power spectrum of the web traffic. The model treats the traffic as a Poisson flow of files with sizes distributed according to a power law. In this model the long-range autocorrelations are independent of the network properties and of the inter-packet time distribution. We reproduced the inter-packet time distribution of incoming web traffic assuming that files arrive as a Poisson sequence with mean inter-packet time uniformly distributed on a logarithmic scale. This simple model may be applicable to other computer networks as well.